ABSTRACT

The effect on speech intelligibility was measured for speech where talkers reading Diagnostic Rhyme Test material 
were exposed to 0.7 g whole body vibration to simulate space vehicle launch. Across all talkers, the effect of 
vibration was to degrade the percentage of correctly transcribed words from 83% to 74%. The magnitude of the 
effect of vibration on speech communication varies between individuals, for both talkers and listeners. A “worst 
case” scenario for intelligibility would be the most “sensitive” listener hearing the most “sensitive” talker; one 
participant’s intelligibility was reduced by 26% (97% to 71%) for one of the talkers. 


1. INTRODUCTION 

A set of investigations to characterize the effects of 
whole-body vibration on speech communications and 
possible mitigation approaches has been underway at 
NASA Ames Research Center since 2009, under the 
colloquial name “Vibrovox.” Part I of this paper,
published in 2009 and titled “Stimuli recording and speech
analysis” [1], investigated the effect of 0.5 and 0.7 g 
whole body vibration on speech production of words, 
using a Diagnostic Rhyme Test (DRT) word list as a 
corpus [2, 3]. Six talkers were recorded in that study 
using a specially designed chair and vibration platform. 
The effect of the vibration was a very pronounced vocal 
“shakiness” related to the excitation frequency of 12 Hz,
with both amplitude and frequency modulation effects
observed in the spectrographic and fundamental
frequency (F0) analyses of the speech. The reader is
referred to this companion paper for details of the 
stimuli used here, and for details on the hardware and 
methods used for imparting vibration. 
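
As a rough illustration of the modulation described above, a sustained voiced sound can be modeled as a sine carrier whose amplitude and fundamental frequency both fluctuate at the 12 Hz excitation rate. The sketch below is not the analysis used in [1]; the carrier frequency, modulation depths, duration, and sample rate are illustrative assumptions chosen only to make the effect visible.

```python
import math

def vibrated_tone(f0=120.0, f_vib=12.0, am_depth=0.3, fm_dev=10.0,
                  duration=0.5, sample_rate=8000):
    """Sine carrier with amplitude and frequency modulation at f_vib.

    All default values are assumptions for illustration, not measured data.
    Returns (samples, instantaneous_frequencies).
    """
    samples, inst_freqs = [], []
    phase = 0.0
    for n in range(int(duration * sample_rate)):
        t = n / sample_rate
        mod = math.sin(2 * math.pi * f_vib * t)   # 12 Hz modulator
        inst_f = f0 + fm_dev * mod                # frequency modulation of F0
        amp = 1.0 + am_depth * mod                # amplitude modulation
        samples.append(amp * math.sin(phase))
        inst_freqs.append(inst_f)
        phase += 2 * math.pi * inst_f / sample_rate  # integrate frequency
    return samples, inst_freqs

samples, freqs = vibrated_tone()
print(len(samples), round(min(freqs), 1), round(max(freqs), 1))
```

With these assumed settings the instantaneous frequency swings by about 10 Hz around the carrier, twelve times per second, which is the kind of periodic "shakiness" the text describes.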

In space flight operations, the intelligibility of radio 
communications between flight deck and ground control 
is of critical concern, particularly during the launch 
phase of flight. Requirements for speech intelligibility, 
including radio communications, are mandated by 
NASA’s Human-Systems Integration Requirements [4]
to provide a level equivalent to a 90% word 
identification rate. Under conditions of extreme 
acceleration and vibration during launch scenarios, this 
goal may not be met for air-ground communications due
to adverse impacts to the vocal production system. 
Effective speech communications are particularly 
necessary during launch since other means of 
communication, such as manual operation of switches 
or keypads, are severely impaired if not impossible. 

This report addresses the relative effects of vibration on 
speech intelligibility for a set of participants listening
to recordings of an unfamiliar set of talkers, with the
talkers exposed to 12 Hz, 0.7 g (zero-peak) vibration. 
The paradigm is concerned with the communication 
channel from a speaking crewmember (exposed to 
vibration) to a listener in a ground control scenario
(not exposed to vibration). The talkers in this 
investigation represent the ‘crew member’ portion of 
this communication channel, and the listeners the 
‘ground control’ portion of the channel. 

2. STIMULI 

Nine participants were presented with previously 
recorded material from five of the six talkers in [1], 
recorded under no vibration and 12 Hz, 0.7 g vibration 
conditions. The vibration in the previous study was 
effected by a fixed-base vibrating chair platform used in 
visual display vibration studies [5]. The 192 stimuli 
words of the Diagnostic Rhyme Test (DRT) described 
in [2, 3] were read by the talkers from four
successively placed 10 x 20 inch panels, each 
containing 48 words printed in 36-point Times Roman. 
The direction of acceleration was along the x-axis, i.e. 
from the rear to the front of the body. 

The participants in this current study were seated in a 
soundproof audiometric booth, with stimuli provided at 
approximately 60 dB SPL via circumaural headphones 
(Sennheiser HD 595). The experimental blocks were 
continuous, with stimuli randomized between vibrated 
and non-vibrated speech conditions using custom 
software. 

3. METHODOLOGY 

Two separate study protocols were used to measure 
comparative intelligibility for vibrated and non-vibrated 
speech. 

For the first protocol, participants were given sequential 
aural presentations of a single word from a 384-word 
list that was formed from the DRT speech material. No
visual reference or training was given for the word list.
Participants were asked to transcribe the word heard by 
typing it into a computer display. Spelling errors or 
phonetically ambiguous words (e.g., “cheap, cheep”) 
were corrected during post-analysis. A total of 3,456 
responses were gathered from the nine participants. 

For the second protocol, a self-paced, two-alternative
forced choice test was run using the DRT protocol
outlined in ANSI S3.2 [4]. Data were gathered for 1,920
responses from each participant (17,280 responses 
total). Participants selected one of two words that were 
presented visually via a hand-held LED display that also 
served as the response device. 
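
The response totals quoted for the two protocols can be checked with simple arithmetic. The per-listener trial counts and the number of listeners come from the text; the decomposition of those counts into DRT words, conditions, and talkers is an assumption that happens to be consistent with the numbers, not something stated explicitly above.

```python
# Figures quoted in the text.
LISTENERS = 9
PROTOCOL_1_TRIALS = 384    # words transcribed per listener
PROTOCOL_2_TRIALS = 1920   # forced-choice responses per listener

# Assumed decomposition of the trial counts (not stated in the text).
DRT_WORDS = 192
CONDITIONS = 2             # no vibration, vibration
TALKERS = 5

def total_responses(trials_per_listener, listeners=LISTENERS):
    """Total responses gathered across all listeners."""
    return trials_per_listener * listeners

print(total_responses(PROTOCOL_1_TRIALS))                     # 3456
print(total_responses(PROTOCOL_2_TRIALS))                     # 17280
print(DRT_WORDS * CONDITIONS == PROTOCOL_1_TRIALS)            # True
print(DRT_WORDS * CONDITIONS * TALKERS == PROTOCOL_2_TRIALS)  # True
```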

4. RESULTS 

Analysis of the results of the first test protocol 
(ANOVA for talker and condition independent 
variables) indicated a significant effect of vibration on 
speech intelligibility (F(1,8) = 141.3, p < .001); see
Figure 1. Across all talkers, the effect of vibration was 
to degrade the percentage of correctly transcribed words 
from 83% to 74%. This average reduction of 9 percentage
points ranged from 6 to 14 points amongst different
listeners. Amongst different talkers, the reduction
ranged from 5 to 13 points; see Table I.
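
The reductions reported here are differences between two percent-correct scores, i.e., percentage points, which is not the same as a relative reduction. The short sketch below makes the distinction explicit for the overall result (83% to 74%) and for the worst-case listener and talker pairing reported in this paper (97% to 71%); the function names are illustrative only.

```python
def point_drop(before_pct, after_pct):
    """Absolute reduction in percentage points."""
    return before_pct - after_pct

def relative_drop(before_pct, after_pct):
    """Reduction expressed as a percentage of the starting score."""
    return 100.0 * (before_pct - after_pct) / before_pct

print(point_drop(83, 74))                # 9
print(round(relative_drop(83, 74), 1))   # 10.8
print(point_drop(97, 71))                # 26
print(round(relative_drop(97, 71), 1))   # 26.8
```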



Figure 1. Intelligibility degradation (5 talkers, 9 
listeners) for transcription of words for non-vibration 
and vibration speech stimuli. 

Overall, it is possible to conclude that the magnitude of
the effect of vibration on speech communication varies 
between individuals, for both talkers and listeners. A 
“worst case” scenario for intelligibility would be the 
most “sensitive” listener hearing the most “sensitive” 
talker; for example, one participant’s intelligibility was 
reduced by 26% (97% to 71%) for one of the talkers. 
This type of experiment was the most challenging of the 
two protocols, since the DRT words are not presented 
visually to the listener, and the stimuli lacked the 
cognitive contextual cues that might be used when 
hearing a sentence containing the word. 

Table I. Percentage of incorrectly transcribed words
across all participants, categorized by talkers t1-t5 under
non-vibration (n) and vibration (v) conditions.

A time-accuracy tradeoff analysis indicated the opposite 
of the usual effect, in that there was a negative 
correlation between accuracy (correct word 
transcription) and time to indicate a response. This is 
partially explained by the fact that many words are in 
fact longer in duration when spoken under vibration 
(primarily due to musculoskeletal compensation to 
produce vowels). It also indicates that faster responses 
(for most talkers without vibration) were more accurate, 
while slower responses (characteristic of vibrated 
talkers) yielded less accurate responses. For the talker 
most “sensitive” to vibration, responses took 500 ms 
longer amongst all listeners, with a corresponding 11%
decrease in accuracy. 
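
A negative correlation of the kind described above can be quantified with a Pearson coefficient between response time and accuracy. The sketch below is a minimal illustration, not the analysis actually performed; the response times and accuracies are invented placeholder values, not data from this study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-talker-condition averages (placeholders only).
response_time_s = [1.2, 1.4, 1.5, 1.7, 1.9, 2.1]
accuracy_pct = [88, 86, 83, 80, 76, 74]

r = pearson_r(response_time_s, accuracy_pct)
print(r < 0)  # slower responses pair with lower accuracy in this toy data
```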

Analysis of the results of the second test protocol 
(paired t-test) also indicated a significant effect of 
vibration on speech intelligibility (t = 17.97, p < .001), 
reducing the number of correctly identified words from 
96% to 94%. The magnitude of the effect of vibration 
was much lower in this experiment due to the two-alternative
forced choice protocol; the participant only
had to choose between two words that they could refer
to visually. In many cases, the information needed for
discrimination was carried by the initial
consonant (e.g., “jaws”, “gauze”). In the analysis
presented in [1], it was found that the vibration had 
more of an effect on the production of vowels or final 
consonants. This experiment therefore represents a less 
challenging type of intelligibility test, perhaps 
analogous to when a limited vocabulary is known in 
advance to the listener. 
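
For readers unfamiliar with the paired t-test used for the second protocol, the statistic is the mean of the paired differences divided by its standard error. The sketch below assumes one percent-correct score per listener in each condition; the text does not specify how the scores were paired, and the numbers are invented placeholders, not data from this study.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)           # sample standard deviation
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical percent-correct scores for nine listeners (placeholders only).
no_vibration = [96.5, 95.8, 96.2, 97.0, 95.5, 96.1, 96.4, 95.9, 96.6]
vibration = [94.3, 93.9, 94.0, 95.1, 93.2, 94.2, 94.4, 93.6, 94.7]

t_stat, dof = paired_t(no_vibration, vibration)
print(round(t_stat, 2), dof)
```

A small but very consistent difference across listeners yields a large t value, which is how a 2-point reduction can still be highly significant.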

5. DISCUSSION 

Overall, these results indicate that speech intelligibility 
tests as prescribed in current standards for spaceflight 
communication systems may be insufficient for 
potentially realistic vibration scenarios. A significant 
degradation of intelligibility was observed in the results 
of two different test protocols, and differing levels of 
sensitivity were observed for both talkers and listeners. 
The results are not particularly surprising given prior
research [e.g., 6-8], but intelligibility standards
typically assume a talker able to enunciate words
clearly. Failing to account for talkers under
significant vibration may impact safety, particularly for 
communication under off-nominal conditions that 
would require departure from a limited vocabulary. 

Intelligibility might be further degraded by the presence 
of constant 3.-G acceleration in conjunction with the 
0.7-g x-axis modulation. This level is predicted as a 
maximum during future spacecraft launch. Informal 
reports from centrifuge studies indicate that the impact 
of such a constant force makes speech very difficult to 
produce; it is expected that further degradation will be 
observed in speech intelligibility experiments using 
stimuli from talkers exposed to both vibration and 
constant g. 

For nominal operational conditions involving vibration, 
it is recommended that speech communication from the 
flight deck should be minimized or set to conform to a 
limited vocabulary, and that methods of processing the 
communication to enhance intelligibility should be 
evaluated. Under off-nominal situations where launch 
vehicle speech communications might be necessary, 
e.g., to describe a specific problem using an unlimited 
vocabulary, these results suggest that methods to
“correct” the audio signal may be required. One approach
currently under development by the author involves 
post-processing of the audio signal into “corrected 
speech” to minimize both the frequency and amplitude 
modulation distortion that results from vibration.
Verification of this approach would involve conducting 
the same studies described here but using the “corrected 
speech”, to determine if intelligibility can be restored to 
the baseline “no vibration” condition. 